{
    "cells": [
        {
            "cell_type": "markdown",
            "id": "cb8d5d8fcf9e2584",
            "metadata": {
                "collapsed": false
            },
            "source": [
                "<a href=\"https://colab.research.google.com/github/milvus-io/bootcamp/blob/master/bootcamp/tutorials/integration/build_RAG_from_s3_with_milvus.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>   <a href=\"https://github.com/milvus-io/bootcamp/blob/master/bootcamp/tutorials/integration/build_RAG_from_s3_with_milvus.ipynb\" target=\"_blank\">\n",
                "    <img src=\"https://img.shields.io/badge/View%20on%20GitHub-555555?style=flat&logo=github&logoColor=white\" alt=\"GitHub Repository\"/>\n",
                "    "
            ]
        },
        {
            "cell_type": "markdown",
            "id": "411413c79fca450f",
            "metadata": {
                "collapsed": false
            },
            "source": [
                "# Building a RAG Pipeline: Loading Data from S3 into Milvus\n",
                "\n",
                "This tutorial walks you through the process of building a Retrieval-Augmented Generation (RAG) pipeline using Milvus and Amazon S3. You will learn how to efficiently load documents from a S3 bucket, split them into manageable chunks, and store their vector embeddings in Milvus for fast and scalable retrieval. To streamline this process, we will use LangChain as a tool to load data from S3 and facilitate its storage in Milvus.\n",
                "\n",
                "## Preparation\n",
                "### Dependencies and Environment"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 1,
            "id": "ce864e8ce73f06c6",
            "metadata": {
                "ExecuteTime": {
                    "end_time": "2025-02-14T05:08:02.458344Z",
                    "start_time": "2025-02-14T05:08:02.450193Z"
                },
                "collapsed": false
            },
            "outputs": [],
            "source": [
                "! pip install --upgrade --quiet pymilvus openai requests tqdm boto3 langchain langchain-core langchain-community langchain-text-splitters langchain-milvus langchain-openai bs4"
            ]
        },
        {
            "cell_type": "markdown",
            "id": "a41e58bcf224c82e",
            "metadata": {
                "collapsed": false
            },
            "source": [
                "> If you are using Google Colab, to enable dependencies just installed, you may need to **restart the runtime** (click on the \"Runtime\" menu at the top of the screen, and select \"Restart session\" from the dropdown menu)."
            ]
        },
        {
            "cell_type": "markdown",
            "id": "b10a38c11b9d203f",
            "metadata": {
                "collapsed": false
            },
            "source": [
                "We will use OpenAI as the LLM in this example. You should prepare the [api key](https://platform.openai.com/docs/quickstart) `OPENAI_API_KEY` as an environment variable."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 44,
            "id": "ac00e805d18729c7",
            "metadata": {
                "ExecuteTime": {
                    "end_time": "2025-01-28T20:37:26.396451Z",
                    "start_time": "2025-01-28T20:37:26.394704Z"
                },
                "collapsed": false
            },
            "outputs": [],
            "source": [
                "import os\n",
                "\n",
                "os.environ[\"OPENAI_API_KEY\"] = \"your-openai-api-key\""
            ]
        },
        {
            "cell_type": "markdown",
            "id": "8a060bc196b531a7",
            "metadata": {
                "collapsed": false
            },
            "source": [
                "## S3 Configuration\n",
                "\n",
                "For loading documents from S3, you need the following:\n",
                "\n",
                "1. **AWS Access Key and Secret Key**: Store these as environment variables to securely access your S3 bucket:"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 45,
            "id": "759bc61d2aa40811",
            "metadata": {
                "ExecuteTime": {
                    "end_time": "2025-01-28T20:37:27.702707Z",
                    "start_time": "2025-01-28T20:37:27.699307Z"
                },
                "collapsed": false
            },
            "outputs": [],
            "source": [
                "os.environ[\"AWS_ACCESS_KEY_ID\"] = \"your-aws-access-key-id\"\n",
                "os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"your-aws-secret-access-key\""
            ]
        },
        {
            "cell_type": "markdown",
            "id": "69c20cdf71a5955b",
            "metadata": {
                "collapsed": false
            },
            "source": []
        },
        {
            "cell_type": "markdown",
            "id": "fb05c7003a0d7828",
            "metadata": {
                "collapsed": false
            },
            "source": [
                "2. **S3 Bucket and Document**: Specify the bucket name and document name as arguments to the `S3FileLoader` class."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 46,
            "id": "db3a6a3b027d399d",
            "metadata": {
                "ExecuteTime": {
                    "end_time": "2025-01-28T20:37:29.040729Z",
                    "start_time": "2025-01-28T20:37:29.034450Z"
                },
                "collapsed": false
            },
            "outputs": [],
            "source": [
                "from langchain_community.document_loaders import S3FileLoader\n",
                "\n",
                "loader = S3FileLoader(\n",
                "    bucket=\"milvus-s3-example\",  # Replace with your S3 bucket name\n",
                "    key=\"WhatIsMilvus.docx\",  # Replace with your document file name\n",
                "    aws_access_key_id=os.environ[\"AWS_ACCESS_KEY_ID\"],\n",
                "    aws_secret_access_key=os.environ[\"AWS_SECRET_ACCESS_KEY\"],\n",
                ")"
            ]
        },
        {
            "cell_type": "markdown",
            "id": "a3ad07248d84f0c1",
            "metadata": {
                "collapsed": false
            },
            "source": [
                "3. **Load Documents**: Once configured, you can load the document from S3 into your pipeline:"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 47,
            "id": "d68aa6869b80abb5",
            "metadata": {
                "ExecuteTime": {
                    "end_time": "2025-01-28T20:37:31.244641Z",
                    "start_time": "2025-01-28T20:37:30.419649Z"
                },
                "collapsed": false
            },
            "outputs": [],
            "source": [
                "documents = loader.load()"
            ]
        },
        {
            "cell_type": "markdown",
            "id": "d7c25eeef0c0a286",
            "metadata": {
                "collapsed": false
            },
            "source": [
                "This step ensures that your documents are successfully loaded from S3 and ready for processing in the RAG pipeline."
            ]
        },
        {
            "cell_type": "markdown",
            "id": "137452b3d5b6cd2f",
            "metadata": {
                "collapsed": false
            },
            "source": [
                "## Split Documents into Chunks\n",
                "\n",
                "After loading the document, use LangChain's `RecursiveCharacterTextSplitter` to break the content into manageable chunks:"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 49,
            "id": "debf1df0104ded88",
            "metadata": {
                "ExecuteTime": {
                    "end_time": "2025-01-28T20:41:15.700134Z",
                    "start_time": "2025-01-28T20:41:15.694833Z"
                },
                "collapsed": false
            },
            "outputs": [
                {
                    "data": {
                        "text/plain": [
                            "Document(metadata={'source': 's3://milvus-s3-example/WhatIsMilvus.docx'}, page_content='Milvus offers three deployment modes, covering a wide range of data scales—from local prototyping in Jupyter Notebooks to massive Kubernetes clusters managing tens of billions of vectors: \\n\\nMilvus Lite is a Python library that can be easily integrated into your applications. As a lightweight version of Milvus, it’s ideal for quick prototyping in Jupyter Notebooks or running on edge devices with limited resources. Learn more.\\nMilvus Standalone is a single-machine server deployment, with all components bundled into a single Docker image for convenient deployment. Learn more.\\nMilvus Distributed can be deployed on Kubernetes clusters, featuring a cloud-native architecture designed for billion-scale or even larger scenarios. This architecture ensures redundancy in critical components. Learn more. \\n\\nWhat Makes Milvus so Fast\\U0010fc00 \\n\\nMilvus was designed from day one to be a highly efficient vector database system. In most cases, Milvus outperforms other vector databases by 2-5x (see the VectorDBBench results). This high performance is the result of several key design decisions: \\n\\nHardware-aware Optimization: To accommodate Milvus in various hardware environments, we have optimized its performance specifically for many hardware architectures and platforms, including AVX512, SIMD, GPUs, and NVMe SSD. \\n\\nAdvanced Search Algorithms: Milvus supports a wide range of in-memory and on-disk indexing/search algorithms, including IVF, HNSW, DiskANN, and more, all of which have been deeply optimized. Compared to popular implementations like FAISS and HNSWLib, Milvus delivers 30%-70% better performance.')"
                        ]
                    },
                    "execution_count": 49,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
                "\n",
                "# Initialize a RecursiveCharacterTextSplitter for splitting text into chunks\n",
                "text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)\n",
                "\n",
                "# Split the documents into chunks using the text_splitter\n",
                "docs = text_splitter.split_documents(documents)\n",
                "\n",
                "# Let's take a look at the first document\n",
                "docs[1]"
            ]
        },
        {
            "cell_type": "markdown",
            "id": "84f40dfe7468f064",
            "metadata": {
                "collapsed": false
            },
            "source": [
                "At this stage, your documents are loaded from S3, split into smaller chunks, and ready for further processing in the Retrieval-Augmented Generation (RAG) pipeline."
            ]
        },
        {
            "cell_type": "markdown",
            "id": "828f6c4f575558d5",
            "metadata": {
                "collapsed": false
            },
            "source": [
                "## Build RAG chain with Milvus Vector Store\n",
                "\n",
                "We will initialize a Milvus vector store with the documents, which load the documents into the Milvus vector store and build an index under the hood."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 50,
            "id": "95081685df72442f",
            "metadata": {
                "ExecuteTime": {
                    "end_time": "2025-01-28T20:44:06.863515Z",
                    "start_time": "2025-01-28T20:44:05.389274Z"
                },
                "collapsed": false
            },
            "outputs": [],
            "source": [
                "from langchain_milvus import Milvus\n",
                "from langchain_openai import OpenAIEmbeddings\n",
                "\n",
                "embeddings = OpenAIEmbeddings()\n",
                "\n",
                "vectorstore = Milvus.from_documents(\n",
                "    documents=docs,\n",
                "    embedding=embeddings,\n",
                "    connection_args={\n",
                "        \"uri\": \"./milvus_demo.db\",\n",
                "    },\n",
                "    drop_old=True,  # Drop the old Milvus collection if it exists\n",
                ")"
            ]
        },
        {
            "cell_type": "markdown",
            "id": "6bb309adec400b01",
            "metadata": {
                "collapsed": false
            },
            "source": [
                "> For the `connection_args`:\n",
                "> - Setting the `uri` as a local file, e.g.`./milvus.db`, is the most convenient method, as it automatically utilizes [Milvus Lite](https://milvus.io/docs/milvus_lite.md) to store all data in this file.\n",
                "> - If you have large scale of data, you can set up a more performant Milvus server on [docker or kubernetes](https://milvus.io/docs/quickstart.md). In this setup, please use the server uri, e.g.`http://localhost:19530`, as your `uri`.\n",
                "> - If you want to use [Zilliz Cloud](https://zilliz.com/cloud), the fully managed cloud service for Milvus, please adjust the `uri` and `token`, which correspond to the [Public Endpoint and Api key](https://docs.zilliz.com/docs/on-zilliz-cloud-console#free-cluster-details) in Zilliz Cloud."
            ]
        },
        {
            "cell_type": "markdown",
            "id": "b1120a3eba563638",
            "metadata": {
                "collapsed": false
            },
            "source": [
                "Search the documents in the Milvus vector store using a test query question. Let’s take a look at the top 1 document."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 51,
            "id": "89a10577c5b57389",
            "metadata": {
                "ExecuteTime": {
                    "end_time": "2025-01-28T20:44:23.588504Z",
                    "start_time": "2025-01-28T20:44:23.130103Z"
                },
                "collapsed": false
            },
            "outputs": [
                {
                    "data": {
                        "text/plain": [
                            "[Document(metadata={'pk': 455631712233193487, 'source': 's3://milvus-s3-example/WhatIsMilvus.docx'}, page_content='Milvus offers three deployment modes, covering a wide range of data scales—from local prototyping in Jupyter Notebooks to massive Kubernetes clusters managing tens of billions of vectors: \\n\\nMilvus Lite is a Python library that can be easily integrated into your applications. As a lightweight version of Milvus, it’s ideal for quick prototyping in Jupyter Notebooks or running on edge devices with limited resources. Learn more.\\nMilvus Standalone is a single-machine server deployment, with all components bundled into a single Docker image for convenient deployment. Learn more.\\nMilvus Distributed can be deployed on Kubernetes clusters, featuring a cloud-native architecture designed for billion-scale or even larger scenarios. This architecture ensures redundancy in critical components. Learn more. \\n\\nWhat Makes Milvus so Fast\\U0010fc00 \\n\\nMilvus was designed from day one to be a highly efficient vector database system. In most cases, Milvus outperforms other vector databases by 2-5x (see the VectorDBBench results). This high performance is the result of several key design decisions: \\n\\nHardware-aware Optimization: To accommodate Milvus in various hardware environments, we have optimized its performance specifically for many hardware architectures and platforms, including AVX512, SIMD, GPUs, and NVMe SSD. \\n\\nAdvanced Search Algorithms: Milvus supports a wide range of in-memory and on-disk indexing/search algorithms, including IVF, HNSW, DiskANN, and more, all of which have been deeply optimized. Compared to popular implementations like FAISS and HNSWLib, Milvus delivers 30%-70% better performance.')]"
                        ]
                    },
                    "execution_count": 51,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "query = \"How can Milvus be deployed\"\n",
                "vectorstore.similarity_search(query, k=1)"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 52,
            "id": "4c3c10ccfaa164be",
            "metadata": {
                "ExecuteTime": {
                    "end_time": "2025-01-28T20:46:59.009945Z",
                    "start_time": "2025-01-28T20:46:58.964037Z"
                },
                "collapsed": false
            },
            "outputs": [],
            "source": [
                "from langchain_core.runnables import RunnablePassthrough\n",
                "from langchain_core.prompts import PromptTemplate\n",
                "from langchain_core.output_parsers import StrOutputParser\n",
                "from langchain_openai import ChatOpenAI\n",
                "\n",
                "# Initialize the OpenAI language model for response generation\n",
                "llm = ChatOpenAI(model_name=\"gpt-3.5-turbo\", temperature=0)\n",
                "\n",
                "# Define the prompt template for generating AI responses\n",
                "PROMPT_TEMPLATE = \"\"\"\n",
                "Human: You are an AI assistant, and provides answers to questions by using fact based and statistical information when possible.\n",
                "Use the following pieces of information to provide a concise answer to the question enclosed in <question> tags.\n",
                "If you don't know the answer, just say that you don't know, don't try to make up an answer.\n",
                "<context>\n",
                "{context}\n",
                "</context>\n",
                "\n",
                "<question>\n",
                "{question}\n",
                "</question>\n",
                "\n",
                "The response should be specific and use statistics or numbers when possible.\n",
                "\n",
                "Assistant:\"\"\"\n",
                "\n",
                "# Create a PromptTemplate instance with the defined template and input variables\n",
                "prompt = PromptTemplate(\n",
                "    template=PROMPT_TEMPLATE, input_variables=[\"context\", \"question\"]\n",
                ")\n",
                "# Convert the vector store to a retriever\n",
                "retriever = vectorstore.as_retriever()\n",
                "\n",
                "\n",
                "# Define a function to format the retrieved documents\n",
                "def format_docs(docs):\n",
                "    return \"\\n\\n\".join(doc.page_content for doc in docs)"
            ]
        },
        {
            "cell_type": "markdown",
            "id": "a91c0e3d1112f76b",
            "metadata": {
                "collapsed": false
            },
            "source": [
                "Use the LCEL(LangChain Expression Language) to build a RAG chain."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 53,
            "id": "b946496dc2869bc2",
            "metadata": {
                "ExecuteTime": {
                    "end_time": "2025-01-28T20:47:01.970347Z",
                    "start_time": "2025-01-28T20:47:00.032717Z"
                },
                "collapsed": false
            },
            "outputs": [
                {
                    "data": {
                        "text/plain": [
                            "'Milvus can be deployed in three different modes: Milvus Lite for local prototyping and edge devices, Milvus Standalone for single-machine server deployment, and Milvus Distributed for deployment on Kubernetes clusters. These deployment modes cover a wide range of data scales, from small-scale prototyping to massive clusters managing tens of billions of vectors.'"
                        ]
                    },
                    "execution_count": 53,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "rag_chain = (\n",
                "    {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n",
                "    | prompt\n",
                "    | llm\n",
                "    | StrOutputParser()\n",
                ")\n",
                "\n",
                "\n",
                "res = rag_chain.invoke(query)\n",
                "res"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "id": "ae2a9f5b3224807f",
            "metadata": {
                "collapsed": false
            },
            "outputs": [],
            "source": []
        }
    ],
    "metadata": {
        "kernelspec": {
            "display_name": "Python 3 (ipykernel)",
            "language": "python",
            "name": "python3"
        },
        "language_info": {
            "codemirror_mode": {
                "name": "ipython",
                "version": 3
            },
            "file_extension": ".py",
            "mimetype": "text/x-python",
            "name": "python",
            "nbconvert_exporter": "python",
            "pygments_lexer": "ipython3",
            "version": "3.11.5"
        }
    },
    "nbformat": 4,
    "nbformat_minor": 5
}
